Efficiency-conscious propositionalization for relational learning

Author

  • Filip Zelezný
Abstract

Systems aiming at discovering interesting knowledge in data, now commonly called data mining systems, are typically employed in finding patterns in a single relational table. Most mainstream data mining tools are not applicable to the more challenging task of finding knowledge in structured data represented by a multi-relational database. Although a family of methods known as inductive logic programming has been developed to tackle that challenge directly, the idea of adapting structured data into a simpler form digestible by the wealth of attribute-value learning (AVL) systems has always been tempting to data miners. To this end, we present a method, based on constructing first-order logic features, that performs this kind of conversion, also known as propositionalization. It incorporates some basic principles suggested in previous research and provides significant enhancements that lead to remarkable improvements in the efficiency of the feature-construction process. We begin by motivating the propositionalization task with an illustrative example, review some previous approaches to propositionalization, and formalize the concept of a first-order feature, elaborating mainly on the points that influence the efficiency of the designed feature-construction algorithm.
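The conversion described in the abstract can be pictured with a minimal Python sketch. It is not the paper's algorithm: the toy data (trains owning cars), the feature names, and the helper functions `holds` and `propositionalize` are all hypothetical, and each first-order feature is simplified to an existentially quantified conjunction of attribute tests over one related record. The sketch only shows how a one-to-many relation can be flattened into a single boolean table that an attribute-value learner could consume.

```python
# Toy relational data (hypothetical, for illustration only).
# Each train owns several cars; each car has its own attributes.
cars = {
    "t1": [{"length": "short", "roof": "open"}, {"length": "long", "roof": "closed"}],
    "t2": [{"length": "long", "roof": "open"}],
    "t3": [{"length": "short", "roof": "closed"}],
}

# A first-order feature is modelled here (an assumption of this sketch) as
# "the train has some car satisfying all of the listed conditions".
features = {
    "has_short_car": {"length": "short"},
    "has_closed_car": {"roof": "closed"},
    "has_short_closed_car": {"length": "short", "roof": "closed"},
}


def holds(conditions, train_cars):
    """True if at least one car meets every attribute=value condition."""
    return any(
        all(car.get(key) == value for key, value in conditions.items())
        for car in train_cars
    )


def propositionalize(cars, features):
    """Return one boolean row per train, i.e. a single attribute-value table."""
    return {
        train: {name: holds(cond, train_cars) for name, cond in features.items()}
        for train, train_cars in cars.items()
    }


table = propositionalize(cars, features)
for train, row in table.items():
    print(train, row)
```

Note that train `t1` has a short car and a closed car, but no single car that is both, so `has_short_closed_car` is false for it. Variable sharing inside a feature is what separates a first-order feature from a plain combination of per-attribute flags.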


Similar resources

EFFICIENCY-CONSCIOUS PROPOSITIONALIZATION FOR RELATIONAL LEARNING Part Two: Boosting Efficiency

Systems aiming at discovering interesting knowledge in data, now commonly called data mining systems, are typically employed in finding patterns in a single relational table. Most of mainstream data mining tools are not applicable in the more challenging task of finding knowledge in structured data represented by a multi-relational database. Although a family of methods known as inductive logic...


On propositionalization for knowledge discovery in relational databases

Propositionalization is a process that leads from relational data and background knowledge to a single-table representation thereof, which serves as the input to widespread systems for knowledge discovery in databases. Systems for propositionalization thus support the analyst during the usually costly phase of data preparation for data mining. Such systems have been applied for more than 15 yea...


Trading Expressivity for Efficiency in Statistical Relational Learning

Statistical relational learning (SRL) combines state-of-the-art statistical modeling with relational representations. It thereby promises to provide effective machine learning techniques for domains that cannot adequately be described using a propositional representation. Driven by new applications in which data is structured, interrelated, and heterogeneous, this area of machine learning has r...


Ensemble Relational Learning based on Selective Propositionalization

Dealing with structured data needs the use of expressive representation formalisms that, however, puts the problem to deal with the computational complexity of the machine learning process. Furthermore, real world domains require tools able to manage their typical uncertainty. Many statistical relational learning approaches try to deal with these problems by combining the construction of releva...


Propositionalization of Relational Learning: An Information Extraction Case Study

This paper develops a new propositionalization approach for relational learning which allows for efficient representation and learning of relational information using propositional means. We develop a relational representation language, along with a relation generation function that produces features in this language in a data driven way; together, these allow efficient representation of the re...



Journal:
  • Kybernetika

Volume 40  Issue 

Pages  -

Publication date 2004